Ensembles of Optimum-Path Forest Classifiers Using Input Data Manipulation and Undersampling

Authors

  • Moacir P. Ponti
  • Isadora Rossi
Abstract

Combining multiple classifiers has proven useful in many applications, both to improve classification accuracy and to stabilize results. In this paper we use the Optimum-Path Forest (OPF) classifier to investigate input data manipulation techniques that use less data from the training set without hampering classification accuracy. Data undersampling can speed up the classification task and could be especially useful with large datasets. The results indicate that the OPF-based ensemble methods allow a significant reduction in the size of the training set while maintaining or slightly improving accuracy. We provide intuition for a case of failure and report results on synthetic and real datasets.
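
As a rough illustration of the idea in the abstract, the sketch below trains each ensemble member on a small random subset of the training set and combines the members by majority vote. It is not the authors' method: a 1-nearest-neighbour rule stands in for the OPF classifier, and the function names, the toy data, and the 20% sampling fraction are all assumptions made for this example.

```python
import random
from collections import Counter


def nn_predict(train, point):
    """1-nearest-neighbour rule, used here only as a stand-in for OPF (assumption)."""
    # train is a list of (features, label) pairs; compare squared Euclidean distances.
    nearest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], point)))
    return nearest[1]


def undersample(samples, fraction, rng):
    """Keep a random `fraction` of each class, with at least one sample per class."""
    by_class = {}
    for feats, label in samples:
        by_class.setdefault(label, []).append((feats, label))
    subset = []
    for group in by_class.values():
        k = max(1, int(len(group) * fraction))
        subset.extend(rng.sample(group, k))
    return subset


def build_ensemble(samples, n_members, fraction, seed=0):
    """Each member is simply the reduced training set it will classify with."""
    rng = random.Random(seed)
    return [undersample(samples, fraction, rng) for _ in range(n_members)]


def ensemble_predict(ensemble, point):
    """Majority vote over the members' individual predictions."""
    votes = Counter(nn_predict(member, point) for member in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated 5x5 grids, 25 samples per class.
samples = [((i * 0.1, j * 0.1), 0) for i in range(5) for j in range(5)]
samples += [((5 + i * 0.1, 5 + j * 0.1), 1) for i in range(5) for j in range(5)]

# Five members, each trained on 20% of every class (10 of the 50 samples).
ensemble = build_ensemble(samples, n_members=5, fraction=0.2)
print(ensemble_predict(ensemble, (0.2, 0.1)))
print(ensemble_predict(ensemble, (5.3, 5.2)))
```

Here every member sees only a fifth of the training data, which is the kind of reduction the abstract refers to; whether accuracy is preserved on harder data depends on the base classifier and the sampling scheme.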


Similar resources

A Markov Random Field Model for Combining Optimum-Path Forest Classifiers Using Decision Graphs and Game Strategy Approach

Research on multiple classifier systems includes the creation of an ensemble of classifiers and the proper combination of their decisions. In order to combine the decisions given by classifiers, methods related to fixed rules and decision templates are often used. Therefore, the influence and relationship between classifier decisions are often not considered in the combination schemes. In th...


Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination

Chemical Named Entity Recognition (NER) is a basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...


Supervised Pattern Classification Using Optimum-Path Forest

We present a graph-based framework for pattern recognition, called Optimum-Path Forest (OPF), and describe one of its classifiers developed for the supervised learning case. This classifier does not require parameters and can handle some overlapping among multiple classes with arbitrary shapes. The method reduces the pattern recognition problem to the computation of an optimum-path forest in ...


Land Use Classification Using Optimum-Path Forest

This paper introduces the Optimum-Path Forest for land use classification, aiming at better environmental management, using images obtained from the CBERS 2B CCD satellite covering the area of the Rio das Pedras watershed, Itatinga City, São Paulo State, Brazil. We also compared the Optimum-Path Forest algorithm with well-known supervised classifiers: Artificial Neural Networks using Mu...


Using Model Trees and Their Ensembles for Imbalanced Data

Model trees are decision trees with linear regression functions at the leaves. Although originally proposed for regression, they have also been applied successfully to classification problems. This paper studies their performance on imbalanced problems. These trees give better results than standard decision trees (J48, based on C4.5) and decision trees specific to imbalanced data (CCPDT: Clas...




Publication date: 2013